As we think about time-averaging and taphonomic bias in the sedimentary record, we can use R tools to visualize these concepts. We talked last week about basic univariate statistics like means, medians, and modes. We familiarized ourselves a bit more with apply(), table(), and summary() to derive insights into the distribution of our data. We also introduced variance and standard deviations.
This week, we develop our understanding of variance, deviation, and standard deviations a bit more, and we will learn how these descriptions of spread can help us work through some examples of how taphonomy can transform fossil communities, and how we might deal with that transformation. We do this because the sedimentary record contains fossils that are biased by, first, biological production and, second, differential preservation. As we try to bridge ecological and paleoecological scales, we must translate fossil counts into landscape distributions, population estimates, or even past environmental conditions. We can understand biological production and sedimentation more clearly as sampling processes. This means that our estimates can never be checked directly against the past biological populations themselves, and that we are always dealing with a biased sample of past communities, to some degree.
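To make the sampling framing concrete, here is a minimal, hypothetical sketch: the taxa, counts, and preservation probabilities below are invented for illustration, not drawn from any real core.

```r
#Toy model of differential preservation as biased sampling.
#All taxon counts and preservation probabilities here are hypothetical.
set.seed(7)
community <- c(rep("Cyperaceae", 600), rep("Typha", 300), rep("Alchornea", 100))
preserve_p <- c(Cyperaceae = 0.8, Typha = 0.4, Alchornea = 0.1) #Assumed odds of preservation
fossil <- community[runif(length(community)) < preserve_p[community]]
round(prop.table(table(community)), 2) #"True" relative abundances
round(prop.table(table(fossil)), 2)    #Abundances after the preservation filter
```

Even in this crude sketch, the rarest taxon nearly vanishes from the "fossil" sample while the best-preserved taxon is inflated; this is exactly the kind of bias we will try to characterize.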
Standard deviations help us characterize variation in either a sample or a population of things. The standard deviation is derived from some related values that are worth knowing.
Methods for describing variance.
mad(fake_core$Cyperaceae.undiff.)
## [1] 13.3434
mad_cyp = sum( abs(fake_core$Cyperaceae.undiff. - median(fake_core$Cyperaceae.undiff.) ) ) / length(fake_core$Cyperaceae.undiff.)
mad_cyp
## [1] 8.86
Above, we get different results between R and the manual calculation because the mad() function multiplies by a constant (about 1.4826) to adjust for "asymptotic normal consistency" (see help(mad)). Notice first that we can use either the mean or the median as the basis for absolute deviation; in mad() this is set with the "center =" and "constant =" arguments. In our manual calculation, we take the mean of the absolute differences between the individual observations and their mean; this looks different from the formula above because the "sum()" and "/length(x)" steps are subsumed into a single call to "mean()". Note that mad(center = mean, constant = 1) still takes the median of those absolute deviations, while our manual version takes their mean, which is why those two values differ slightly. If we instead calculate the value manually using medians throughout (second half of the code below), we get the same value as the "mad()" function with a constant of 1.
mad(fake_core$Cyperaceae.undiff., center = mean(fake_core$Cyperaceae.undiff.), constant = 1)
## [1] 8.74
mad_cyp = mean(abs(fake_core$Cyperaceae.undiff. - mean(fake_core$Cyperaceae.undiff.)))
mad_cyp
## [1] 8.8652
#Now we calculate the median absolute deviation.
mad_cyp = median(abs(fake_core$Cyperaceae.undiff. - median(fake_core$Cyperaceae.undiff.)))
mad_cyp
## [1] 9
mad(fake_core$Cyperaceae.undiff., constant = 1)
## [1] 9
Here we are working with deviation from the mean or median. This is a reasonable way to express deviation in a set of numbers, but its applications and interpretation are limited (e.g., how easily can we compare between variables?). However, it lays the foundation for how we calculate variance, which is the sum of the squared deviations from the mean divided by the number of observations. For samples, we divide instead by the number of observations less one (N - 1). Base R has a function for variance, "var()", but we can also calculate it manually.
var(fake_core$Cyperaceae.undiff.)
## [1] 105.164
var_cyp = mean((fake_core$Cyperaceae.undiff. - mean(fake_core$Cyperaceae.undiff.))^2)
#Perhaps R calculates variance for a sample and not a population...
var_cyp = sum((fake_core$Cyperaceae.undiff. - mean(fake_core$Cyperaceae.undiff.))^2)/
(length(fake_core$Cyperaceae.undiff.)-1) #Note that we're taking N - 1 here.
var(fake_core$Cyperaceae.undiff.)
## [1] 105.164
var_cyp
## [1] 105.164
In this case, we’ve finally gotten identical values when compared with the base R function. But what does variance mean and how do we interpret it? R lets us generate a lot of fake and specifically structured data. Let’s look at some general features of variance by looking at systematically modified input.
my_df = sapply(1:10, function(x){ #Using function to create progressively larger runs of numbers that are equally spaced.
seq(x, x*100, x)
})
apply(my_df, 2, var)
## [1] 841.6667 3366.6667 7575.0000 13466.6667 21041.6667 30300.0000
## [7] 41241.6667 53866.6667 68175.0000 84166.6667
plot(apply(my_df, 2, var), type = "o", pch = 21, bg = "orange")
Naturally, as the range of values gets larger, the variance goes up. But while the maximum of each column grows linearly (by 100s), the variance does not grow linearly. Let's look at a longer series of columns and see how variance and the overall spread of the data interact.
my_df = sapply(1:100, function(x){ #Using function to create progressively larger runs of numbers that are equally spaced.
seq(x, x*100, x)
})
apply(my_df, 2, var)
## [1] 841.6667 3366.6667 7575.0000 13466.6667 21041.6667
## [6] 30300.0000 41241.6667 53866.6667 68175.0000 84166.6667
## [11] 101841.6667 121200.0000 142241.6667 164966.6667 189375.0000
## [16] 215466.6667 243241.6667 272700.0000 303841.6667 336666.6667
## [21] 371175.0000 407366.6667 445241.6667 484800.0000 526041.6667
## [26] 568966.6667 613575.0000 659866.6667 707841.6667 757500.0000
## [31] 808841.6667 861866.6667 916575.0000 972966.6667 1031041.6667
## [36] 1090800.0000 1152241.6667 1215366.6667 1280175.0000 1346666.6667
## [41] 1414841.6667 1484700.0000 1556241.6667 1629466.6667 1704375.0000
## [46] 1780966.6667 1859241.6667 1939200.0000 2020841.6667 2104166.6667
## [51] 2189175.0000 2275866.6667 2364241.6667 2454300.0000 2546041.6667
## [56] 2639466.6667 2734575.0000 2831366.6667 2929841.6667 3030000.0000
## [61] 3131841.6667 3235366.6667 3340575.0000 3447466.6667 3556041.6667
## [66] 3666300.0000 3778241.6667 3891866.6667 4007175.0000 4124166.6667
## [71] 4242841.6667 4363200.0000 4485241.6667 4608966.6667 4734375.0000
## [76] 4861466.6667 4990241.6667 5120700.0000 5252841.6667 5386666.6667
## [81] 5522175.0000 5659366.6667 5798241.6667 5938800.0000 6081041.6667
## [86] 6224966.6667 6370575.0000 6517866.6667 6666841.6667 6817500.0000
## [91] 6969841.6667 7123866.6667 7279575.0000 7436966.6667 7596041.6667
## [96] 7756800.0000 7919241.6667 8083366.6667 8249175.0000 8416666.6667
plot(apply(my_df, 2, var),
type = "o",
pch = 21,
bg = "orange")
lines(apply(my_df, 2, mad)*1000) #Here we're adding the MAD values for the same dataset, multiplied by 1000 to make them visible.
Here, we see that populations which are structurally identical but which differ in scale have very different variances. Specifically, variance grows with the square of the scale: multiply every value by k and the variance grows by k-squared, which is why the curve above bends upward rather than rising in a straight line. The median absolute deviation, by contrast, grows linearly, which is why the scaled MAD line crosses the variance curve once, sitting above it for the smaller columns and below it afterwards. The broad lesson is that the larger the total counts, the more variance we expect to find. We will explore more about variance and population structure (i.e., the composite frequency distribution of variables in a population) once we have mastered some more univariate statistics.
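The quadratic-versus-linear scaling is easy to verify directly; a quick check with arbitrarily chosen numbers (both comparisons should return TRUE):

```r
#Variance scales with the square of a constant multiplier,
#while MAD (like SD) scales linearly.
x <- 1:100
k <- 7
isTRUE(all.equal(var(k * x), k^2 * var(x))) #Quadratic scaling of variance
isTRUE(all.equal(mad(k * x), k * mad(x)))   #Linear scaling of MAD
```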
We can learn a bit more by creating longer lists of repeating numbers, which will show us some of the impacts of sample size. Below, we make a short list of numbers with a mean and median of 9, then repeat that list up to 100 times, taking the variance and median absolute deviation at each repetition. The code below gives us three plots: one showing variance and MAD across the 100 repetitions of the number list, one showing a histogram of the data, and one showing the frequency distribution as displayed using a combination of "plot()" and "table()" methods.
test_range = c(5:13)
plot_var = sapply(1:100, function(x){
var(rep(test_range, x))
})
plot_mad = sapply(1:100, function(x){
mad(rep(test_range, x))
})
par(mfrow = c(1, 3))
plot(plot_var, ylim = c(0,10), pch = 21, bg = "orange")
lines(plot_mad)
hist(rep(test_range, 100))
plot(table(rep(test_range,100)))
par(mfrow = c(1, 1))
Regarding variance, we see a minor impact of the total sample size on our measurement of variance, but it quickly approaches an asymptote after the list has been doubled three or four times. The distribution of this data is even (uniform), and we see that increasing the number of observations (so long as they're the same numbers) does not change our estimates of the mean or median. Will this hold for a set of numbers where the frequency distribution isn't even?
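Before we do, it is worth seeing why repetition nudges the variance at all: a minimal sketch of the N - 1 correction at work (the repeat counts are arbitrary choices).

```r
#var() divides by N - 1, so repeating the same values shrinks the sample
#correction toward the population variance (which divides by N).
x <- 5:13
pop_var <- mean((x - mean(x))^2) #Population variance: 60/9, about 6.67
sapply(c(1, 2, 4, 8, 100), function(k) var(rep(x, k))) #Decreases toward pop_var
```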
norm_range = #Using <shift+enter>, we can already make a visualization of the counts...
c(5, 5,
6, 6, 6,
7, 7, 7, 7,
8, 8, 8, 8, 8,
9, 9, 9, 9, 9, 9,
10, 10, 10, 10, 10,
11, 11, 11, 11,
12, 12, 12,
13, 13)
plot_var = sapply(1:100, function(x){
var(rep(norm_range, x))
})
plot_mad = sapply(1:100, function(x){
mad(rep(norm_range, x))
})
par(mfrow = c(1, 2))
plot(plot_var, ylim = c(floor(min(plot_mad)) - 0.5, ceiling(max(plot_var)) +0.5), pch = 21, bg = "orange")
lines(plot_mad)
plot(table(rep(norm_range,100)))
par(mfrow = c(1, 1))
Let’s try this with a v-shaped distribution.
v_range = c(
5, 5, 5, 5, 5, 5,
6, 6, 6, 6, 6,
7, 7, 7, 7,
8, 8, 8,
9, 9,
10, 10, 10,
11, 11, 11, 11,
12, 12, 12, 12, 12,
13, 13, 13, 13, 13, 13)
plot_var = sapply(1:100, function(x){
var(rep(v_range, x))
})
plot_mad = sapply(1:100, function(x){
mad(rep(v_range, x))
})
par(mfrow = c(1, 2))
plot(plot_var, ylim = c(floor(min(plot_mad)) - 0.5, ceiling(max(plot_var)) +0.5), pch = 21, bg = "orange")
lines(plot_mad)
plot(table(rep(v_range,100)))
par(mfrow = c(1, 1))
The replication structure we’re using here will serve us in the future. Let’s codify the repetitions and plotting into a function to save us all this copy-pasting.
stat_lines = function(x,#Creating a function called "stat_lines" it expects some vector of data "x" and the following arguments
times = length(x), #Setting this to "length(x)" by default, so calling this won't always be necessary. But we can add our own numbers!
title = NA){ #Setting up a title object, defaulting to NA
plot_u = sapply(1:times, function(y){ #Here's that fast custom function that creates means for the repeated set of numbers.
mean(rep(x, y))
})
plot_med = sapply(1:times, function(y){ #Same, but for the median.
median(rep(x, y))
})
plot_var = sapply(1:times, function(y){ #Same, but for variance.
var(rep(x, y))
})
plot_mad = sapply(1:times, function(y){ #Same, but for median absolute deviation
mad(rep(x, y))
})
stat_labels = c("mean", "median", "variance", "median abs. dev.")
plot_stats = data.frame(plot_u, plot_med, plot_var, plot_mad)
plot(0, 0, xlim = c(0, times), ylim = c(0, max(plot_stats)+1), pch = NA, xlab = "N repeats", ylab = "value", main =paste0(title," Repeated Statistics"))
for(i in 1:ncol(plot_stats)){
points(plot_stats[, i], type = "o", pch = 21, lty = i, bg = i, cex = 0.7)
text(times*0.05+(i*(0.2*times)), min(plot_stats[, i])+(0.04*max(plot_stats)), labels = stat_labels[i])
}
}
stat_lines(v_range, title = "V-Shaped Range")
This function makes plotting much easier, allowing us to make some quick comparisons between our two small datasets, which we inflate using the "rep()" function.
par(mfrow = c(1,2))
stat_lines(v_range, title = "V-Shaped Range", times = 100)
stat_lines(norm_range, title = "Normal Range", times = 100)
par(mfrow = c(1,1))
We can apply this same approach to some of the fake core data.
par(mfrow = c(2,2))
stat_lines(fake_core$Cyperaceae.undiff., title = "Cyperaceae")
stat_lines(fake_core$Typha, title = "Typha")
stat_lines(fake_core$Alchornea, title = "Alchornea")
stat_lines(fake_core$Guibourtia.demeusei, title = "Guibourtia")
par(mfrow = c(1,1))
What all of this demonstrates is that:
- variance is sensitive to the overall magnitude of the values, growing with the square of any scaling;
- variance is expressed in squared units, which makes it hard to compare directly with the observations themselves;
- once the shape of a distribution is fixed, simply repeating the observations changes these statistics very little.
Thus, to standardize our estimate of deviation, we take the square root of variance.
sd(fake_core$Cyperaceae.undiff.)
## [1] 10.25495
sd_cyp = sqrt(var(fake_core$Cyperaceae.undiff.))
sd_cyp
## [1] 10.25495
Let’s see how it performs when we systematically manipulate the data again.
par(mfrow = c(1, 2))
my_df = sapply(1:10, function(x){ #Using function to create progressively larger runs of numbers that are equally spaced.
seq(x, x*100, x)
})
#apply(my_df, 2, sd)
plot(apply(my_df, 2, sd), type = "o", pch = 21, bg = "orange")
lines(apply(my_df, 2, mad))
my_df = sapply(1:1000, function(x){
seq(x, x*100, x)
})
plot(apply(my_df, 2, sd), type = "o", pch = 21, bg = "orange")
lines(apply(my_df, 2, mad))
par(mfrow = c(1, 1))
Let’s add standard deviations to our basic plots and remove the variance.
stat_lines = function(x,#Creating a function called "stat_lines" it expects some vector of data "x" and the following arguments
times = length(x), #Setting this to "length(x)" by default, so calling this won't always be necessary. But we can add our own numbers!
title = NA){ #Setting up a title object, defaulting to NA
plot_u = sapply(1:times, function(y){ #Here's that fast custom function that creates means for the repeated set of numbers.
mean(rep(x, y))
})
plot_med = sapply(1:times, function(y){ #Same, but for the median.
median(rep(x, y))
})
#plot_var = sapply(1:times, function(y){ #Same, but for variance. #We can keep our code and just hash this out.
# var(rep(x, y))
#})
plot_mad = sapply(1:times, function(y){ #Same, but for median absolute deviation
mad(rep(x, y))
})
plot_sd = sapply(1:times, function(y){
sd(rep(x, y))
})
stat_labels = c("mean", "median", "median abs. dev.", "SD") #Because we automate everything below, all we have to do is modify the names.
plot_stats = data.frame(plot_u, plot_med, plot_mad, plot_sd) #We're using the plot_stats dataframe below, so we modify the entry here.
plot(0, 0, xlim = c(0, times), ylim = c(0, max(plot_stats)+1), pch = NA, xlab = "N repeats", ylab = "value", main =paste0(title," Repeated Statistics"))
for(i in 1:ncol(plot_stats)){
points(plot_stats[, i], type = "o", pch = 21, lty = i, bg = i, cex = 0.7)
text((times/ncol(plot_stats)*i), min(plot_stats[, i])+(0.04*max(plot_stats)), labels = stat_labels[i]) #Changing text plotting to fit new variable.
}
}
Let’s plot our selected taxa again.
par(mfrow = c(2,2))
stat_lines(fake_core$Cyperaceae.undiff., title = "Cyperaceae")
stat_lines(fake_core$Typha, title = "Typha")
stat_lines(fake_core$Alchornea, title = "Alchornea")
stat_lines(fake_core$Guibourtia.demeusei, title = "Guibourtia")
par(mfrow = c(1,1))
At last, we see that the SD is slightly more conservative than the median absolute deviation. Because we've taken the variance, which is sensitive to total dispersion, and then taken its square root, we return to the original units of the data and end up with a value that is much easier to interpret. This is why most of us understand some basic principles of standard deviations without knowing why the value has these characteristics: the standard deviation is, roughly, the typical distance from the mean that characterizes a set of values.
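One consequence worth verifying: for roughly normal data, about 68% of observations fall within one standard deviation of the mean and about 95% within two. A quick simulation (the sample size and parameters here are arbitrary choices):

```r
#Empirical check of the "68-95" intuition for normally distributed data.
set.seed(1)
x <- rnorm(1e5, mean = 50, sd = 10)
mean(abs(x - mean(x)) <= sd(x))     #Close to 0.68
mean(abs(x - mean(x)) <= 2 * sd(x)) #Close to 0.95
```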
Armed with our newfound knowledge of variance, we can make good use of some powerful R functions. For example, we can verify this with some population thinking using "rnorm()", which lets us create a normally distributed sample using arguments for the number of random values, the mean, and the standard deviation.
norm_vector = rnorm(n = 100, mean = 50, sd = 10)
hist(norm_vector)
Let’s see how this distribution changes when we systematically increase the sample size.
par(mfrow = c(3,2)) #We're making six plots
invisible( #Wrapping in invisible() so the hist() return values aren't printed
sapply(1:6, function(x){ #Using a custom function six times
hist(rnorm(n = 10^x, #Normally distributed random numbers, increasing by orders of magnitude (10^x)
mean = 50, #Set the mean of the random values
sd = 10), #Set the standard deviation
breaks = 50, #Breaks for histogram
main = paste0("Histogram for n = ", 10^x)) #Title for plots
}))
par(mfrow = c(1,1))
R also lets us create skewed data using rbeta(). There are two arguments for shape ("shape1" and "shape2"); as the two values get closer together, the distribution becomes more symmetric and approximately normal.
par(mfrow = c(3,2))
invisible( #Suppress the printed hist() return values
sapply(exp(seq(0.2, 1.2, 0.2)), function(x){ #Using a custom function six times
hist(rbeta(10000, #10000 random draws from a beta distribution
x, #Setting shape1
10), #Setting shape2
breaks = 50, #Breaks for histogram
main = paste0("Histogram for Skewed Data")) #Title for plots
}))
par(mfrow = c(1,1))
We can swap the shape arguments to create skews in the other direction.
par(mfrow = c(3,2))
invisible( #Suppress the printed hist() return values
sapply(exp(seq(0.2, 1.2, 0.2)), function(x){ #Using a custom function six times
hist(rbeta(10000, #10000 random draws from a beta distribution
10, #Setting shape1
x), #Setting shape2
breaks = 50, #Breaks for histogram
main = paste0("Histogram for Skewed Data")) #Title for plots
}))
par(mfrow = c(1,1))
If we shrink our stable shape value below one, we approach an exponential-looking distribution, and the shape no longer approaches normality as the two parameters converge.
par(mfrow = c(3,2))
invisible( #Suppress the printed hist() return values
sapply(1:6, function(x){ #Using a custom function six times
hist(rbeta(10000, #10000 random draws from a beta distribution
0.2, #Setting shape1 below one
x), #Setting shape2
breaks = 50, #Breaks for histogram
main = paste0("Histogram for Skewed Data")) #Title for plots
}))
par(mfrow = c(1,1))
Being able to manipulate distributions in this way lets us encode specific expectations for how production and preservation may bias our estimates of past community composition and rates of past ecological change. Let's look at our basic statistics again, using skewed distributions. We will build four vectors, each with 1000 random values drawn from a different distribution.
my_log <- rlogis(1000)
my_exp <- rexp(1000)
my_nrm <- rnorm(1000)
my_skw <- rbeta(1000, 10, 2)
my_dist <- data.frame(my_log, my_exp, my_nrm, my_skw)
my_hist <- function(x, data){
hist(data[,x], breaks = 50, main = colnames(data)[x])
}
invisible(lapply(1:ncol(my_dist), my_hist, data = my_dist))
boxplot(my_dist, horizontal = TRUE, las = 1)
We encounter a problem with the above: the ranges of the numbers are rather different. In particular, "rlogis()" draws from the logistic distribution, which is symmetric and looks roughly normal in the middle but has much heavier tails, so its min and max sit far from its mean.
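A minimal sketch of that heavy-tails point, with both samples standardized to SD = 1 so the comparison is fair (the sample size and the 3-SD cutoff are arbitrary choices):

```r
#Compare tail mass beyond 3 SDs for logistic vs. normal draws.
set.seed(42)
logis_z <- as.vector(scale(rlogis(1e5))) #Standardize: mean 0, sd 1
norm_z  <- as.vector(scale(rnorm(1e5)))
c(logistic = mean(abs(logis_z) > 3), #Extreme values are several times
  normal   = mean(abs(norm_z) > 3))  #more common under the logistic
```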
my_df = sapply(1:1000, function(x){
rnorm(100, mean = x*10, sd = 10)
})
#apply(my_df, 2, var) #Hashed out: printing 1,000 variances would flood the page. Because sd is fixed at 10, they all hover around 10^2 = 100 regardless of the mean.
plot(apply(my_df, 2, var), type = "o", pch = 21, bg = "orange")